HybridTune: Spatio-temporal Data and Model Driven Performance Diagnosis for Big Data Systems
Abstract
With rapidly growing interest in Big Data systems, analyzing and improving their performance has become increasingly important. Although much research effort has gone into improving the performance of Big Data systems, efficiently analyzing and diagnosing performance bottlenecks in these massively distributed systems remains a major challenge. In this paper, we propose a spatio-temporal correlation analysis approach based on the stage characteristics and distribution characteristics of Big Data applications, which associates multi-level performance data at a fine granularity. On the basis of the correlated data, we define a priori rules, select features, and vectorize the corresponding datasets for different performance bottlenecks, such as workload imbalance, data skew, abnormal nodes, and outlier metrics. We then use data-driven and model-driven algorithms for bottleneck detection and diagnosis. In addition, we design and develop a lightweight, extensible tool, HybridTune, and validate its diagnostic effectiveness with BigDataBench on several benchmark experiments, in which it outperforms state-of-the-art methods. Our experiments show that the accuracy of abnormal/outlier detection reaches about 80%. Finally, we report several Spark and Hadoop use cases that demonstrate how HybridTune helps users carry out performance analysis and diagnosis efficiently on Spark and Hadoop applications; our experience shows that HybridTune can help users find performance bottlenecks and provide optimization recommendations.
Similar resources
Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network
Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...
Introducing Deep Modular Neural Networks with a Dual Spatio-temporal Structure to Improve Persian Continuous Speech Recognition
In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...
Spatio-temporal variation of wheat and silage maize water requirement using CGMS model
The Crop Growth Monitoring System (CGMS) has been applied for spatial biophysical resource analysis of Borkhar & Meymeh district in Esfahan province, Iran. The potentially suitable area for agriculture in the district has been divided into 128 homogeneous land units in terms of soil (physical characteristics), weather and administrative unit. Crop parameters required in the WOFOST simulatio...
Assessment of Neonate's Congenital Hypothyroidism Pattern Using Poisson Spatio-temporal Model in Disease Mapping under the Bayesian Paradigm during 2011-18 in Guilan, Iran
Background: Congenital Hypothyroidism (CH) is one of the reasons for mental retardation and defective growth in neonates. It can be treated if it is diagnosed early. The congenital hypothyroidism can be diagnosed using newborn screening in the first days after birth. Disease mapping helps to identify high-risk areas of the disease. This study aimed to evaluate the pattern of CH using the Poisso...
Evaluation of Tests for Separability and Symmetry of Spatio-temporal Covariance Function
In recent years, some investigations have been carried out to examine the assumptions like stationarity, symmetry and separability of spatio-temporal covariance function which would considerably simplify fitting a valid covariance model to the data by parametric and nonparametric methods. In this article, assuming a Gaussian random field, we consider the likelihood ratio separability test, a va...
Journal: CoRR
Volume: abs/1711.07639
Publication date: 2017